Event Description:
Explore the latest in Natural Language Processing and Large Language Models at the Munich NLP in-person meetup. Join us on February 10, 2026 at TNG Technology Consulting. Doors open at 18:00 and the program starts at 18:30. We’ll feature two expert talks on cutting-edge LLM applications, followed by pizza and networking. All are welcome!

Agenda

  • 18:00 - Open Door
  • 18:30 - Welcome + Intro to MunichNLP and TNG Technology Consulting GmbH
  • 18:40 - AI Research @ TNG: How to process 20 billion tokens per day on OpenRouter - Henrik Klagges & Fabian Klemm & Daniel Klingmann
  • 19:20 - Break (5 min)
  • 19:25 - Benchmarking Memory in LLMs: Retrieval, Long Context, and Multi-Turn Interaction - Ali Modarressi
  • 20:05 - Pizza + Networking
  • 21:30 - Close

Talk abstracts (2 × 30 min + Q&A)

  1. AI Research @ TNG: How to process 20 billion tokens per day on OpenRouter
    Daniel Klingmann & Fabian Klemm & Henrik Klagges
    TNG combined efficient use of limited GPU resources with innovative approaches to construct high-performance child LLMs based on DeepSeek parent models. The talk outlines some of the approaches that worked, some that did not, and how the resulting model variants differ. The practical relevance of the variants has been demonstrated empirically by the more than 20 billion tokens these models process every day, at times reaching 105k requests per hour. The Chimera models, for example, made TNG one of OpenRouter's top 10 open-source model creators.
  2. Benchmarking Memory in LLMs: Retrieval, Long Context, and Multi-Turn Interaction
    Ali Modarressi
    As LLM systems increasingly rely on retrieval, long contexts, and extended interaction, it becomes essential to benchmark how reliably they access and use information. This talk first presents controlled evidence that dense retrievers can exhibit systematic biases toward heuristic cues—favoring shorter documents, earlier mentions, repeated entities, or literal matches—sometimes ranking these above evidence that actually contains the answer. It then introduces implicit fact retrieval settings in which relevance depends on facts stated only implicitly in documents, requiring temporal, arithmetic, or world-knowledge inference despite superficially simple queries. Next, the talk examines long-context evaluation beyond literal matching, showing substantial performance degradation as context grows once lexical-overlap cues are removed. Finally, it covers dialogue-conditioned benchmarks for extended interactions that quantify drift and trade-offs among persona consistency, instruction following, and safety behavior over long conversations. The talk concludes by highlighting how these benchmarks can guide design decisions for memory-augmented LLM systems.

Bio Ali Modarressi
Ali Modarressi is a third-year PhD student at the Center for Information and Language Processing (CIS) at LMU Munich, supervised by Prof. Hinrich Schütze. Their current research focuses on memory-augmented large language models and, more broadly, long-context language modeling. They have also worked on interactive language generation and information extraction. Ali began their NLP research during their MSc under the supervision of Mohammad Taher Pilehvar, where they studied explainability methods and the interpretability of pre-trained language models—topics that remain relevant to their current work, particularly in analyzing retrieval models and knowledge probing.

Sponsors:
Thank you to TNG Technology Consulting GmbH for sponsoring and supporting the organization of this event.
Organizer:
Munich NLP
